How "Deep" is Learning Word Inflection?

Authors

  • Franco Alberto Cardillo
  • Marcello Ferro
  • Claudia Marzi
  • Vito Pirrelli
Abstract

English. Machine learning offers two basic strategies for morphology induction: lexical segmentation and surface word relation. The first one assumes that words can be segmented into morphemes. Inducing a novel inflected form requires identification of morphemic constituents and a strategy for their recombination. The second approach dispenses with segmentation: lexical representations form part of a network of associatively related inflected forms. Production of a novel form consists in filling in one empty node in the network. Here, we present the results of a recurrent LSTM network that learns to fill in paradigm cells of incomplete verb paradigms. Although the process is not based on morpheme segmentation, the model shows sensitivity to stem selection and stem-ending boundaries.

Italiano. La letteratura offre due strategie di base per l’induzione morfologica. La prima presuppone la segmentazione delle forme lessicali in morfemi e genera parole nuove ricombinando morfemi conosciuti; la seconda si basa sulle relazioni di una forma con le altre forme del suo paradigma, e genera una parola sconosciuta riempiendo una cella vuota del paradigma. In questo articolo, presentiamo i risultati di una rete LSTM ricorrente, capace di imparare a generare nuove forme verbali a partire da forme già note non segmentate. Ciononostante, la rete acquisisce una conoscenza implicita del tema verbale e del confine con la terminazione flessionale.
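The contrast between the two strategies named in the abstract can be made concrete with a toy sketch. It is not the LSTM model the abstract reports: the function names, the prefix-based analogy rule, and the Italian example verbs are assumptions chosen only for illustration. The first function recombines a pre-segmented stem and ending; the second fills an empty paradigm cell purely from the surface relation between two whole forms of a known verb.

```python
# Toy illustration only. Neither function is the recurrent LSTM described
# in the abstract; all names and the analogy rule are hypothetical.


def inflect_by_segmentation(stem: str, ending: str) -> str:
    """Strategy 1: build a form by recombining a segmented stem and ending."""
    return stem + ending


def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest shared prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def fill_cell_by_analogy(known_source: str, known_target: str, new_source: str) -> str:
    """Strategy 2: fill an empty cell from relations between whole word forms.

    The surface change between two forms of a known verb is replayed on the
    corresponding form of a new verb. No morpheme inventory is stored.
    """
    k = common_prefix_len(known_source, known_target)
    old_suffix = known_source[k:]   # material that is dropped
    new_suffix = known_target[k:]   # material that replaces it
    if not new_source.endswith(old_suffix):
        raise ValueError("the known pair does not apply to this form")
    return new_source[: len(new_source) - len(old_suffix)] + new_suffix


# Both strategies produce the 1st person plural present of "parlare".
print(inflect_by_segmentation("parl", "iamo"))
print(fill_cell_by_analogy("amare", "amiamo", "parlare"))
```

The analogy rule here is deliberately crude (one known pair, suffix replacement only). A learned model such as the one in the abstract would instead generalise over many attested paradigms.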

Similar resources

Heuristical Coding of String Transformations

As part of a study on statistical grammar learning, this article investigates word inflection, which is used to create different grammatical forms of a word. The paper examines alternative codings for describing strings and string transformations. The proposed methods are tested on a natural language, Hungarian.

Unsupervised Learning of A-Morphous Inflection with Graph Clustering

This paper presents a new approach to unsupervised learning of inflection. The problem is defined as two clusterings of the input wordlist: into lexemes and into forms. Word-Based Morphology is used to describe inflectional relations between words, which are discovered using string edit distance. A graph of morphological relations is built and clustering algorithms are used to identify lexemes....

A Grouping Hotel Recommender System Based on Deep Learning and Sentiment Analysis

Recommender systems are important tools for users to identify their preferred items and for businesses to improve their products and services. In recent years, the use of online services for selecting and reserving hotels has grown rapidly. Customers' reviews have replaced word-of-mouth marketing, but searching for hotels based on user priorities is more time-consuming. This s...

The Comparison of Computer Assisted Teaching and Traditional Explicit Method in Learning / Teaching English Vocabulary.

This review surveys research on second language vocabulary teaching and learning since 1999. It first considers the distinction between incidental and intentional vocabulary learning. Although learners certainly acquire word knowledge incidentally while engaged in various language learning activities, more direct and systematic study of vocabulary is also required. There is a discussion of how word...

Named Entity Recognition in Persian Text using Deep Learning

Named entity recognition is a fundamental task in natural language processing. It is also known as a subtask of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...

Reexamining the vocabulary spurt.

The authors asked whether there is evidence to support the existence of the vocabulary spurt, an increase in the rate of word learning that is thought to occur during the 2nd year of life. Using longitudinal data from 38 children, they modeled the rate of word learning with two functions, one with an inflection point (logistic), which would indicate a spurt, and one without an inflection point ...

Publication date: 2017